Running summary() on the mpg dataset shows that it has 234 observations. The scatter plot cannot show all of them, because many points share the same coordinates and are drawn on top of one another. To fix this we can add jitter, and we can also add color coding with a legend so that the data points are distinguishable by model or some other attribute.
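As a rough check on how much overlap there is, the sketch below compares the number of rows in mpg with the number of distinct (cty, hwy) positions. It assumes the ggplot2 and dplyr packages are installed; the names n_rows and n_pairs are illustrative only.

```r
library(ggplot2)  # provides the mpg dataset
library(dplyr)

# Total number of observations in mpg
n_rows <- nrow(mpg)

# Number of distinct (cty, hwy) positions a plain scatter plot can show
n_pairs <- nrow(distinct(mpg, cty, hwy))

n_rows
n_pairs
```

If n_pairs is much smaller than n_rows, many observations are hidden behind others in the plain scatter plot.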
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.4.3
## -- Attaching packages ------------------------------------------------------ tidyverse 1.2.1 --
## v ggplot2 2.2.1 v purrr 0.2.4
## v tibble 1.4.2 v dplyr 0.7.4
## v tidyr 0.7.2 v stringr 1.2.0
## v readr 1.1.1 v forcats 0.2.0
## Warning: package 'tibble' was built under R version 3.4.3
## Warning: package 'tidyr' was built under R version 3.4.3
## Warning: package 'purrr' was built under R version 3.4.3
## Warning: package 'dplyr' was built under R version 3.4.3
## -- Conflicts --------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(nycflights13)
## Warning: package 'nycflights13' was built under R version 3.4.3
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
  geom_point()
The width and height parameters of geom_jitter() control the amount of horizontal and vertical jittering.
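A minimal sketch of how these parameters might be set, combined with the color coding mentioned earlier. The values 0.3 and the choice of class as the color variable are assumptions picked for illustration, not recommended settings.

```r
library(ggplot2)

# width and height set the maximum horizontal and vertical offset;
# 0.3 is an assumed value, small relative to the 1 mpg spacing of the data
p_jitter <- ggplot(data = mpg, mapping = aes(x = cty, y = hwy, color = class)) +
  geom_jitter(width = 0.3, height = 0.3)

p_jitter
```

Because the offsets are random, the plot looks slightly different on each run unless a seed is set with set.seed() first.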
geom_count() draws a larger point the more observations share the same x-y coordinate, so overlap shows up as point size. Jitter instead makes the points distinct by adding a small random vertical or horizontal offset to each one.
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
  geom_count()
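The default position adjustment for geom_boxplot() is "dodge" in ggplot2 2.2.1, the version loaded above; ggplot2 3.0.0 and later use "dodge2".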
p <- ggplot(mpg, aes(class, hwy))
p + geom_boxplot()
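The effect of the position adjustment only becomes visible when several boxes share one x value. The sketch below is one way to see it, assuming drv on the x axis and class as the fill variable (both chosen here for illustration); p_default and p_identity are hypothetical names.

```r
library(ggplot2)

# Default position adjustment: boxes for each class are placed
# side by side within every drv group
p_default <- ggplot(mpg, aes(x = drv, y = hwy, fill = class)) +
  geom_boxplot()

# position = "identity" turns the adjustment off, so the boxes
# within a drv group are drawn on top of one another
p_identity <- ggplot(mpg, aes(x = drv, y = hwy, fill = class)) +
  geom_boxplot(position = "identity")

p_default
p_identity
```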
The labs() function modifies axis, legend, and plot labels.
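For example, a sketch that sets each kind of label on a boxplot of mpg. The label text and the name p_labelled are made up for illustration.

```r
library(ggplot2)

p_labelled <- ggplot(mpg, aes(x = class, y = hwy, fill = drv)) +
  geom_boxplot() +
  labs(
    title = "Highway fuel economy by vehicle class",  # plot title
    x = "Vehicle class",                               # x-axis label
    y = "Highway miles per gallon",                    # y-axis label
    fill = "Drive train"                               # legend title for the fill aesthetic
  )

p_labelled
```

The legend title is set through the name of the aesthetic it belongs to, here fill.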
The plot below shows that city miles per gallon and highway miles per gallon are positively and roughly linearly related. coord_fixed() keeps a fixed aspect ratio, which matters here because both x and y are measured in miles per gallon, so one unit should cover the same distance on each axis. geom_abline(), called with no arguments, draws a reference line with intercept 0 and slope 1; with coord_fixed() it appears at a 45-degree angle, which makes it easy to compare highway mileage against city mileage.
ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
  geom_point() +
  geom_abline() +
  coord_fixed()
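The linear relationship read off the plot can also be checked numerically. The sketch below uses base R's cor() and lm(); the names r and fit are illustrative, and a straight-line fit is only one simple way to summarize the relationship.

```r
library(ggplot2)  # provides the mpg dataset

# Pearson correlation between city and highway mileage
r <- cor(mpg$cty, mpg$hwy)

# Simple linear regression of highway mileage on city mileage
fit <- lm(hwy ~ cty, data = mpg)

r
coef(fit)
```

A correlation close to 1 and a positive slope for cty would agree with what the scatter plot suggests.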
The variable name is misspelled when it is displayed: my_varıable contains a dotless ı in place of i, so it does not match my_variable.
my_variable <- 10
my_varıable
## [1] 10
#> Error in eval(expr, envir, enclos): object 'my_varıable' not found
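The two characters look almost identical on screen, so it can help to compare their code points. This sketch assumes an R session that handles UTF-8 strings; dotless_i is a hypothetical name.

```r
# "\u0131" is the escape for the dotless i, written this way so the
# example does not depend on the encoding of the source file
dotless_i <- "\u0131"

utf8ToInt("i")        # code point of the ordinary letter i
utf8ToInt(dotless_i)  # code point of the dotless i

# The two names are different strings, so R treats them as different objects
identical("my_variable", paste0("my_var", dotless_i, "able"))
```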
library(tidyverse)
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
filter(mpg, cyl == 8)
filter(diamonds, carat > 3)
# arr_delay is recorded in minutes; 120 assumes the intent is a delay of two or more hours
filter(flights, arr_delay >= 120)
filter(flights, dest == 'IAH' | dest == 'HOU')
filter(flights, carrier == 'UA' | carrier == 'AA' | carrier == 'DL')
filter(flights, month == 7 | month == 8 | month == 9)
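The chained == comparisons above can also be written with %in%, which tends to be easier to read as the list of values grows. This is a sketch of an equivalent form, assuming the dplyr and nycflights13 packages are installed; the names houston, big_three, and summer are illustrative.

```r
library(dplyr)
library(nycflights13)

# Flights to either Houston airport
houston <- filter(flights, dest %in% c("IAH", "HOU"))

# Flights operated by United, American, or Delta
big_three <- filter(flights, carrier %in% c("UA", "AA", "DL"))

# Flights that departed in July, August, or September
summer <- filter(flights, month %in% 7:9)
```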
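# dep_time is stored as HHMM and midnight appears as 2400 rather than 0;
# the extra condition assumes midnight departures should be included
flights %>%
  filter(between(dep_time, 0, 600) | dep_time == 2400)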